A Pseudo-Boolean Set Covering Machine

Authors

  • Pascal Germain
  • Sébastien Giguère
  • Jean-Francis Roy
  • Brice Zirakiza
  • François Laviolette
  • Claude-Guy Quimper
Abstract

The Set Covering Machine (SCM) is a machine learning algorithm that constructs a conjunction of Boolean functions. This algorithm is motivated by the minimization of a theoretical bound. However, finding the optimal conjunction according to this bound is a combinatorial problem. The SCM approximates the solution using a greedy approach. Even though SCM seems very efficient in practice, it is unknown how it compares to the optimal solution. To answer this question, we present a novel pseudo-Boolean optimization model that encodes the minimization problem. It is the first time a Constraint Programming approach addresses the combinatorial problem related to this machine learning algorithm. Using that model and recent pseudo-Boolean solvers, we empirically show that the greedy approach is surprisingly close to the optimal.
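The greedy construction mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual conjunction-case heuristic in which each candidate Boolean feature is scored by the number of remaining negative examples it rejects minus a penalty times the number of remaining positive examples it rejects. The names `greedy_scm`, `predict`, `penalty`, and `max_size` are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence

# A Boolean feature maps an example to True/False (assumed representation).
Feature = Callable[[Sequence[float]], bool]


def greedy_scm(features: List[Feature], X: List[Sequence[float]],
               y: List[bool], penalty: float = 1.0,
               max_size: int = 3) -> List[int]:
    """Greedily select feature indices whose conjunction fits (X, y)."""
    negatives = {i for i, label in enumerate(y) if not label}
    positives = {i for i, label in enumerate(y) if label}
    chosen: List[int] = []
    while negatives and len(chosen) < max_size:
        best = None
        best_score = float("-inf")
        for k, h in enumerate(features):
            if k in chosen:
                continue
            # Negatives rejected by h are "covered"; positives rejected are errors.
            covered = {i for i in negatives if not h(X[i])}
            errors = {i for i in positives if not h(X[i])}
            score = len(covered) - penalty * len(errors)
            if score > best_score:
                best, best_score = (k, covered, errors), score
        if best is None or not best[1]:
            break  # no remaining feature covers any further negative example
        k, covered, errors = best
        chosen.append(k)
        negatives -= covered
        positives -= errors
    return chosen


def predict(features: List[Feature], chosen: List[int],
            x: Sequence[float]) -> bool:
    """Output of the conjunction of the selected features on x."""
    return all(features[k](x) for k in chosen)


# Toy one-dimensional data set (made up for illustration).
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [False, False, True, True, False]
features = [lambda x: x[0] > 1.5, lambda x: x[0] < 3.5, lambda x: x[0] > 0.5]
chosen = greedy_scm(features, X, y)
```

On this toy data the heuristic happens to reach a consistent conjunction. In general a greedy choice like this carries no optimality guarantee, which is the gap the pseudo-Boolean model in the paper is meant to measure.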


Similar resources

Characteristic matrix of covering and its application to Boolean matrix decomposition

Covering-based rough sets provide an efficient theory for dealing with covering data, which are common in practical applications. Boolean matrix decomposition has been widely applied to data mining and machine learning. In this paper, three types of existing covering approximation operators are represented by Boolean matrices, which are then used to decompose Boolean matrices. First, we de...


Learning with the Set Covering Machine

We generalize the classical algorithms of Valiant and Haussler for learning conjunctions and disjunctions of Boolean attributes to the problem of learning these functions over arbitrary sets of features; including features that are constructed from the data. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the Set Covering Machine. We present...


Propagation Models and Fitting Them for the Boolean Random Sets

To study the relationship between random Boolean sets and some explanatory variables, this paper introduces a Propagation model. This model can be applied when the corresponding Poisson process of the Boolean model is related to the explanatory variables and the random grains are not affected by these variables. An approximation for the likelihood is used to find pseudo-maximum likelihood esti...


Persistency for higher-order pseudo-boolean maximization

A pseudo-Boolean function is a function from a 0/1-vector to the reals. Minimizing pseudo-Boolean functions is a very general problem with many applications. In image analysis, the problem arises in segmentation or as a subroutine in tasks like stereo estimation and image denoising. Recent years have seen an increased interest in higher-degree problems, as opposed to quadratic pseudo-Boolean func...


The Set Covering Machine

We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a version of the...




Publication date: 2012